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Abstract 

In this paper, we propose a Bayesian MAP estimator for solving the deconvohition prob- 
lems when the observations are corrupted by Poisson noise. Towards this goal, a proper 
data fidelity term (log-likelihood) is introduced to reflect the Poisson statistics of the 
noise. On the other hand, as a prior, the images to restore are assumed to be positive 
and sparsely represented in a dictionary of waveforms such as wavelets or curvelets. Both 
analysis and synthesis-type sparsity priors are considered. Piecing together the data fi- 
delity and the prior terms, the deconvolution problem boils down to the minimization 
of non-smooth convex functionals (for each prior). We establish the well-posedness of 
each optimization problem, characterize the corresponding minimizers, and solve them by 
means of proximal splitting algorithms originating from the realm of non-smooth convex 
optimization theory. Experimental results are conducted to demonstrate the potential 
applicability of the proposed algorithms to astronomical imaging datasets. 

Keywords: Deconvolution, Poisson noise. Proximal iteration. Iterative thresholding. 
Sparse representations. 



1. Introduction 

Deconvolution is a longstanding problem in many areas of signal and image process- 
ing (e.g. biomedical imaging [3, 31, 33], astronomy [3, 38], remote-sensing, to quote a 
few). For example, research in astronomical image deconvolution has seen considerable 
work, partly triggered by the Hubble space telescope (HST) optical aberration problem 
at the beginning of its mission. In biomedical imaging, researchers are also increasingly 
relying on deconvolution to improve the quality of images acquired using complex op- 
tical systems, like confocal microscopes [31]. Deconvolution may then prove crucial for 
exploiting images and extracting scientific content, and is more and more involved in 
everyday research. 
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An extensive literature exist on the deconvolution problem [1, 25, 41]. In presence 
of Poisson noise, several deconvolution methods have been proposed such as Tikhonov- 
Miller inverse filter and Richardson-Lucy (RL) algorithms; see [33, 38] for comprehensive 
reviews. The RL has been extensively used in many applications, but as it tends to am- 
plify the noise after a few iterations, several extension have been proposed. For example, 
wavelet-regularized RL algorithm has been proposed by several authors [24, 36, 37]. The 
interested reader may refer to [3, 13] for a more complete review on deconvolution with 
Poisson noise. 

In the context of deconvolution with either Gaussian or Poisson noise, sparsity- 
promoting regularization over orthobasis or frame dictionaries has been recently proposed 
[7, 10, 11, 13, 20, 21, 22, 32, 40]. 

For instance, in [13] the authors proposed to stabilize the Poisson noise using the 
Anscombc variance stabilizing transform [2] to bring the problem back to deconvolution 
with additive white Gaussian noise, but at the price of a non-linear data fidelity term. 
However, stabilization has a cost, and the performance of such an approach degrades in 
low intensity regimes. On the other hand, the authors in [32] use directly the data fidelity 
term related to Poisson noise (see Section 3) and exploit its additive structure together 
with some particularities of the composition of the convolution operator with tight frames 
in order to solve the problem using a proximal framework. However, their framework only 
applies to some special convolution kernels. Using the augmented Lagrangian method 
with the alternating method of multipliers algorithm, [21] presented a deconvolution 
algorithm with either a synthesis or an analysis prior. In fact, ADMM is nothing but the 
Douglas-Rachford splitting [15] applied to the Fenchel-Rockafellar dual objective. 

Contributions. In this paper, we propose an image deconvolution algorithm for data 
blurred and contaminated by Poisson noise using analysis and synthesis-type sparsity pri- 
ors. In order to form the data fidelity term, we take the exact Poisson likelihood. Putting 
together the data fidelity and the prior terms, the deconvolution problem is formulated as 
the minimization of a maximum a posteriori (MAP) objective functional involving three 
terms: the data fidelity term: a non-negativity constraint (as Poisson data are positive by 
definition) ; and a regularization term, in the form of a non-smooth penalty that enforces 
the sparsity of the sought after image over a — possibly redundant — dictionary of wave- 
forms. We establish the well-posedness of our optimization problems and characterize the 
corresponding minimizers. We then solve them by means of proximal splitting algorithms 
originating from the field of non-smooth convex optimization theory. More precisely we 
take benefit from a generalization of Douglas-Rachford splitting to product spaces; see 
Section 4.2. In order to use this proximal splitting algorithm, the proximity operator of 
each individual term in the objective functional is established. Experimental results on 
several imaging datascts arc carried out to demonstrate the potential applicability of the 
proposed algorithms and compare them with some competitors. 

Notation and terminology. Let H a real Hilbert space, here a finite dimensional real 
vector space. We denote by [|.|| the norm associated with the inner product (., .) in H, 
and I is the identity operator onH. x and a are respectively reordered vectors of image 
samples and coefiicients. We denote by riC the relative interior of a convex set C. 

A real-valued function / is coercive, if lim||2||_>._|_oo / (z) = -|-oo. The domain of / is 
defined by dom/ = {z G H \ f{z) < +00} and / is proper if dom/ ^ 0. A real-valued 
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function / is lower semi-continuous (Isc) if liminf^^^j, f{z) > /(^o). ro('H) is the class 
of all proper Isc convex functions from H to (—00, +00]. The subdifferential of a function 
/ e To{n) at zgU is the set df{z) = {u€H\\ly€H, f{y) ^ f{z) + {u,y- z)}. Its 
Legendre-Fenchel conjugate is /*. We denote by tc the indicator of the convex set 

C: ic{z) = \ ii z € C , ^ j2gj ^ more comprehensive account on convex 

1+00, otherwise. 

analysis. 

We denote by |||M||| = max^^o ^^^^ the spectral norm of M. 
2. Sparsity prior 

Let X € M" be an y/nx ^Jn image. We denote by # the dictionary, i.e. the n x L matrix 
whose columns are the generating waveforms (called atoms) (</5i)i<i<L ^ normalized to 
a unit ^2-iiorm. The forward (analysis) transform is then defined by a non-necessarily 
square matrix e M^^" with L ^ n. When L > n the dictionary is said to be 
redundant or overcomplete. A dictionary matrix * is said to be a frame with bounds 



II 2 

ci and C2, < ci < C2 < +00, if ci ||a;|| ^ 



2 2 
^ C2 ||a;|| .A frame is tight when 



ci = C2 = c, i.e. = cl. 

In a synthesis prior model, the image x can be written as the superposition of the ele- 
mentary atoms ifii according to the following linear generative model x = X^i^i = 
^a. x is said to be sparse (resp. compressible)^ in * if it can be represented exactly 
(resp. approximately) as a superposition of a small fraction of the atoms in the dictionary 
# compared to the ambient dimension n. In an analysis-type prior model, the transform 
coefficients ^^x of the image x are assumed to be sparse. 

In the sequel, the dictionary $ corresponds to a frame and will be built by taking 
union of one or several transforms. For instance, for astronomical objects such as stars, 
the wavelet transform is a very good candidate [27]. For more anisotropic or elongated 
structures, ridgelets or curvelets would be a better choice [4]. 

3. Problem statement 

Consider the image formation model where an input image of n pixels x is blurred by 
a point spread function (PSF) h and contaminated by Poisson noise. The observed image 
is then a discrete collection of counts y = {yi)i-^i<^n which arc boimdcd, i.e. y G £oo- 
Each count yi is a realization of an independent Poisson random variable with a mean 
{h ® x)i, where ® is the circular convolution operator. Formally, this writes, 

yi^V{{h®x)i) . (1) 

This formula can be rewritten in a vector form as y 'P(Hx), where H is a circular 
convolution matrix. The deconvolution problem at hand is to restore x from the observed 
count image y. 



^With a slight abuse of terminology, in both cases we will use the term sparse. 
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A natural way to attack this problem would be to adopt a maximum a posteriori 
(MAP) bayesian framework with an appropriate likelihood function — the distribution 
of the observed data y given an original x — reflecting the Poisson statistics of the 
noise. As a prior, the image is supposed to be economically (sparsely) represented in 
a pre-chosen dictionary $ as measured by a sparsity-promoting penalty ^' supposed 
throughout to be proper, Isc and convex but non-smooth, e.g. the l\ norm. 

From the probability density function of a Poisson random variable, the likelihood 
writes: 

and the associated log-likelihood function is 

U{y\x) = J2 iy[i\ log((Ha;)[i]) - {Ux)[i\ - log(y[i]!)) . (3) 

i 

These formula are extended to the case y = 0^ using the convention 0! = 1. Taking 
negative log-hkehhood, we arrive at the following data fidelity term: 

n 

h :r?eM"^5^/poisson(r?[i]), (4) 
if > 0, /poisson (??[«]) 



-y[i] log(?7[i]) + 7?[i] if r?[i] > 0, 
-|-oo otherwise. 



if = 0, /poisson (??[«]) = 



Tl[i] if G [0,+oo), 
-|-oo otherwise. 



Our aim is then to solve the following optimization problems, with either an analysis- 
type prior, 

argmin J(a;), 
J : a; /i o H(a;) -h 7* o *'^(a;) -|-zc(a;) , 

or a synthesis-type prior, 

argmin J' (a), 
J' : a I-)- /i o H o *(a) -I- 7\l/(a) + o $(a) . 

h(oi) /30*(a) 

The penalty function ^ is additive, i.e. ^{a) = X]i^o^('^W)' 7 > is a regularization 
parameter and ic is the indicator function of the closed convex set C. In our case, C is 
the positive orthant since we are fitting Poisson intensities, which are positive by nature. 

In (P^,^,), the solution image x\ is directly targeted whose transform (analysis) co- 
efficients $ x\ arc the sparsest. While in problem (P^,,^^), we are seeking a sparse set of 
coefficients a*g and the solution signal or image is synthesized from these representation 
coefficients and the dictionary * as a:g = ^ag. 



Obviously, problems (P-y.tp) and (P^^^, ^) arc not equivalent in general unless $ is an 
orthobasis. For overcomplete non-orthogonal dictionaries, the solutions to synthesis and 
analysis formulations are different. Indeed, in the synthesis prior the set of solutions Xg 
is confined to the column space of the dictionary, whereas in the analysis formulation, 
the solutions a;^ are allowed to be arbitrary vectors in M". Furthermore, for redundant 
dictionaries, there are much fewer unknowns in the analysis formulation, hence leading to 
a "simpler" optimization problem, although the proximity operators become more com- 
plicated because of composition even for simple functions. As opposed to the analysis 
approach, the synthesis approach has a constructive form providing an explicit descrip- 
tion of the signals it represents, and as such, can benefit from higher redundancy to 
synthesize rich and complex signals. On the other hand, in the analysis approach, we 
can ask a signal or image to simultaneously agree with many columns of ^. This might 
become impossible with a highly redundant dictionary. 

It can be shown that for a Parscval tight frame (PTF) dictionary using the change 
of variable a — ^^x, an analysis-type prior formulation can be written as a linearly 
constrained synthesis-type prior formulation. The constraint ensures that a remains in 
the span of . Therefore, analysis- type prior forms can be seen as a subset of synthesis- 
type prior ones; see [17] for another geometrical argument. If a global minimizer of 
the synthesis-type prior satisfies the span analysis constraint, i.e. is a feasible analysis 
solution, then both problems are equivalent. Other equivalence conditions can be drawn 
if stronger assumptions (but useless to promote sparse solutions) are imposed on the 
penalty ^; see e.g. [17]. 

With the notable exception of [17], and a recent paper by [14] on compressed sensing 
(with PTF dictionaries), a little is known about the actual theoretical guarantees of 
analysis-type priors in general. From a practical point of view, there is no consensus 
either on the type of conclusions to be drawn from the experimental work reported in 
the literature. Some authors have reported results where the analysis prior is superior to 
its synthesis counterpart, e.g. [5, 17, 34]. Others have shown that they are more or less 
equivalent. In our work here, the synthesis prior turns out to be better on the Saturn 
image, presumably because the dictionary is very well suited to sparsely synthesize the 
image (see Section 6.3). 

In short, these phenomena are still not very well understood, especially for general 
inverse problems operators and dictionaries More investigation is needed in this 
direction to extricate the deep distinctions and similarities between the two priors. 

3.1. Properties of the objective functions 

From (P^,^) and (P^y_^), we have the following. 

Proposition 1. 

(i) fi is a convex function and so are /i o H and f\ o H o ^. 

(ii) fi is strictly convex i/Vi € {1, . . . ,n},y[i] ^ 0. /i o H remains strictly convex if 
ker(H) = 0, and so does /i o H o $ if ^ is an orthobasis and ker(H) = 0. 
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(Hi) The gradient of fi is 



V/i(77) = {givmi^i^n, (5) 
+00 else. 



ify[i\=0, 9{v[i]) 



1 ifv[i\>0, 
+00 else. 



(iv) Suppose that (0, +00) nH([0, +00)) 7^ and ^ has full domain. Then both J G 
ro(M") and J' e To{R^). 

Proof. The proof of (i) and (ii) follow from well-known properties of convex functions, 
(iii) is obtained by straightforward calculations, (iv): J and J' arc the sums of Isc 
convex functions which entails their convexity and lower semicontinuity. $ is a frame, 
hence surjective. Thus under the assumptions of (iv), the domains of both J and J' are 
non-empty, i.e. proper functions. □ 

3.2. Well-posedeness of {P-y,^,) and {P'^^^) 

Let A4 and Ai' be respectively the sets of minimizers of problems (P-y.^i,) and (P^^ ^). 
Suppose that 3i G {!,..., n} such that y[i] ^ 0, otherwise the minimizer would be 
trivially 0. Thus J and J' are coercive {^'^ is injective). Therefore, the following holds: 

Proposition 2. 

1. Existence: each of (P-y,^) and {P'^ ^) has at least one solution, i.e. M. ^ $ and 

M'y^<D. 

2. Uniqueness: (P^,^)) and {P'^^^) have unique solutions if ^ is strictly convex, or 
under (ii) of Proposition 1. 



4. Iterative minimization algorithm 

We are now ready to describe the proximal splitting algorithm to solve (P7,i/>) and 
(P^^). At the heart of this splitting framework is the notion of proximity operator which 
we are about to introduce. 

Proximity operator 

The proximity operator is a generalization of the projection onto a closed convex set, 
introduced by Moreau [28, 29, 30], 

Definition 3 (Moreau[28]). Let f <£ ro('H). Then, for every x G "H, the function 
y ^ f{y) + — y|| /2 achieves its infimum at a unique point denoted by pT:oxj:X. The 
operator proxj ■.H^H thus defined is the proximity operator of f. 
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The proximity operator enjoys many properties, among these, one can easily show 
that \/x,p e T-L 

p = proxy X <^ x-pe df{p) {y-p,x-p) + f{p) < f{y) eH , (6) 

which means that prox^ is the resolvent of the subdifferential of / [15]. Recall that the 
resolvent of the subdifferential df is the single- valued operator Jgf : "H — > "H such that 
yxGn,x-J9fix)GdfiJaf) ^ Jdf = iI + df)-\ 

4-2. A splitting algorithm for sums of convex functions 

Suppose that the objective function can be expressed as the sum of K convex, Isc 
and proper functions, 

K 

avgmm{f{x) = J2Mx)) ■ (7) 

Proximal splitting methods for solving (7) are iterative algorithms which may evaluate 
the individual proximity operators proxy^. (perhaps approximately), but never proximity 
operators of sums of the fi, and a fortiori that of /. The key idea relies on the formulation 
of (7) which is such that the proximity operator of each individual function has some 
(relatively) convenient structure, while those of the sums do not in general. This turns 
out to be true in our case for problems (P7,^) and (P'^ 

Splitting algorithms have an extensive literature since the 1970's, where the case 
K = 2 predominates. Usually, splitting algorithms handling K > 2 have either explicitly 
or implicitly relied on reduction of (7) to the (;asc! JiT = 2 in the product space T-L^ . For in- 
stance, applying the Douglas- Rachford splitting to the reduced form produces Spingarn's 
method [35], which performs independent proximal steps on each /j, and then computes 
the next iterate by essentially averaging the individual proximity operators. The method 
proposed in [9] is very similar in spirit to Spingarn's method, where moreover, it allows 
for inexact computation of the proximity operators. 

For every i e {!,..., K}, let {at^ijtm be a sequence in %. Let {xt)ten be the sequence 
as constructed by Algorithm 1. The following result due to [9] establishes the convergence 
of {xt)ten- We reproduce it here for the sake of completeness. 

Theorem 4 ([9]). Suppose that f is coercive and let {xt)t&i he a sequence generated by 
Algorithm 1 under the following assumptions, 

1. f is a proper function; 

2. (0, ... ,0) e ri{(.T - .Ti, . . . ,x - xk) | x G G dom/i, ...,xk& dom/if}; 

I G {1, . . . , K} Y.tm II < +00- 

Then {xt)ten converges to a (non-strict) global minimizer. 

Notice that the sequences (a(t,i))teN allow for some errors in the evaluation of the 
different proximity operators, with the proviso that these errors remain summable. This 
is useful when the closed forms of the proximity operators are difficult to construct and 
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Algorithm 1 



Task: Solve the convex optimization problem (7). 
Initialization: 

Choose /7, e (0,+oo), (p(o,i))i^i^K e U^, {oJi)i^i^K e (0,1]^ satisfying Yn^i'^i = ^ 

and xq = E^i^jP(o,i)- 
Main iteration: 
For f = to A'ext - 1, 

\lie{l,...,K} = prox^^./^^ p(t^i) + a(^t,i) , 

K 

Choose 6't e]0, 2[ , 

Vi e {l,...,/^} P(t+i,i) =P(t,i) +61* (2Ct - a;t - ^(t,i)) , 
xt+i = xt + 6t{^t - xt) . 

End main iteration 

Output: A solution of (7): Xn^^^- 



rather, must be computed via an inner iteration. This will turn to be the case for some 
terms involved in the two objective functions of (P-^,^) and (P^y^^)- More precisely, the 
difficulty at stake is how to deal with the composition with a bounded linear operator, 
here the circular convolution operator and/or the dictionary. This is what we are about 
to handle. 

4.3. Proximity operator of a pre- composition with a linear operator 

The following result will manifest its importance as a building block to solve the two 
problems (P-y.^) and iV'^ ^i). Indeed, it expresses the proximity operator of a function 
/ S ro('H) composed with an affine operator A : R" — >• W^,x i-> Fx ~ y, y & where 
F : ffi" — )• is a bounded Unear operator. 

Theorem 5. 

Let F be a linear bounded operator such that the domain qualification condition ri(dom(/)n 
Im(A)) ^ holds. Then / o A e ro(M™) and 

(i) IfF is a tight frame. Then, 

proXf^j^{x) = y + c"^F'^(prox^j -I)A(a;). (8) 

(a) If F is a general frame. Let Tt G (0, 2/C2). Let {ut)ten be sequence of iterates 

provided by Algorithm 2. Then, (ut)teN converges to u and {pt)teN converges to 
pmxfajs^x = X — F'^u. More precisely, these two sequences converge linearly and 
the best rate is attained for rt = 2/(ci + 02): 

||Pt-prox/oA(a:)|| < ^^( ca + ci ) l^o - Prox/oAla:)!! • (9) 



Algorithm 2 



Task: Forward-backward algorithm to compute proxjo^(a;). 

Parameters: The function /, the linear bounded operator A, number of iterations Ni^t 

and step-size tj G (0,2/c2). 

Initialization: Choose uq € M™, po = x — F'^uq. 
Main iteration: 
For f = to A^int - 1, 

Ut+l = Tt{I - pioxf/^Jiut/n + Apt), ^^^^ 
Pt+i =x- F'^ut+1. 

End main iteration 

Output: The proximity operator of / o A at x : pjvmt • 



(iiij In all other cases, apply Algorithm, 2 with Tt & (0, 2/C2). Then, {ut)tGN converges 
to u, and {pt)teN converges to proxyo^a; = x — F'^u at the rate 0{l/t). 

See the appendix for a concise proof. 

When F is a frame, the convergence speed of the one-step forward-backward scheme 
[23, 39] depends clearly on the redundancy of the frame. The higher the redundancy, the 
slower the convergence, i.e. the number of necessary iterations to obtain an £-accurate 
solution is O loge"^^ . For the general case (iii) where F is not a frame, the algorithm 

necessitates as large as 0{l/e) iterations to reach an e-accuracy on the iterates. 

Assessing the convergence rate as we have just done is important when this algorithm 
will be used in a nested scheme as a sub-iteration to compute the proximity operator of 
the composition with a linear operator. Indeed, Theorem 4 and the discussion thereafter, 
clearly states the stability of Algorithm 1 to errors in the individual proximity operators 
of the functions fj, i = ,K, provided that the errors remain summablc. In a 

nutshell, this means that if Algorithm 2 is used within Algorithm 1, a sufficient number 
of inner iterations -/Vjnt must be chosen to ensure error summability. This number of 
inner iterations obviously depends on the convergence speed of Algorithm 2, hence on 
the structure of F. 

4.-4- Proximity operator of a sparsity-promoting penalty 

To implement the above iteration, we need to express prox^^. The following result 
gives us the formula for a wide class of penalties V : 

Lemma 6. Suppose that ijj satisfies, (i) tp is convex even-symmetric , non-negative and 
non- decreasing on [0, -|-oc), and, V-'(O) = 0. (ii) "0 is twice differentiable on R \ {0}. (iii) 
tp is continuous on R, it is not necessarily smooth at zero and admits a positive right 
derivative at zero tp'^{Q) = lim;j_^Q+ > 0. Then, the proximity operator of ^'^{a), 
prox^^(a) has exactly one continuous solution decoupled in each coordinate ai : 

prox {a) - ^ ^^'+^^^ (11) 



A proof of this lemma can be found in [19]. A similar result is also proposed in 
[8]. Among the most popular penalty functions ^ satisfying the above requirements, 
we have tp{ai) = \ai\, in which case the associated proximity operator is the popular 
soft-thresholding. 

4-5. Proximity operator of the data fidelity f\ 

The following result can be proved easily by solving the proximal optimization prob- 
lem in Definition 3 with /i as defined in (4), see also [7]. 

Lemma 7. Let y be the count map (i.e. the observations), the proximity operator asso- 
ciated to /i (i.e. the Poisson anti log-likelihood) is, 

x\i\~^+^(x[i]-PY+Al^y[i\ \ _ ^^2) 



5. Sparse iterative deconvolution 

The game to play now is to plug the previous results to compute the proximity 
operators involved in problems (P7,i/;) and (P!^,^) in Algorithm 1. 



Algorithm 3 

Teisk: Image deconvolution with Poisson noise, solve (P^^^)- 

Parameters: The observed image counts y, the dictionary number of main iterations 

A/ext, number of sub- iterations iVint for the proximal operator of the data fidelity term, 

/X > and regularization parameter 7. 

Initialization: 

Vz €{0,1,2}, P(o,i) = *V 

ao = ^^V- 

Main iteration: 

For t = Qto Aext - 1, 

• Data fidelity (Lemma 7 and Theorem 5(i)-(iii)): ^(^^o) = Wo^^ifioiio^/sPitfi)- 

• Sparsity-penalty (Lemma 6): ^(t,i) = prox^^^/3P(t_i) = ST^7/3(p(t_i)). 

• Positivity constraint (Theorem 5(i)): ^(^^2) = WO^iqo^ /s{P{t,i))■ 
• Average the proximity operators: = {^(tfi) + +^(t,2))/3. 

• Choose (9* e]0,2[. 

• Update the components: V« e {0, 1, 2}, = P(t,i) + Ot{'^£,t - at- i{t,i))- 

• Update the coefficients estimate: at+i = at + (^t — a*) 

End main iteration 

Output: Deconvolved image x* = ^a^Vexf 
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5.1. Analysis-type prior 

For problem (P^^^), the different proxiraity operator are eomputed as follows: 

(1) proxy^^ojj is obtained owing to Theorem 5(iii) and Lemma 7; 

(2) proxj^o^T = prox^^o^T is given by Theorem 5(iii) and Lemma 6; 

(3) proxjg = proXj^ is directly is the euclidean projection onto the closed convex set C. 

5.2. Synthesis-type prior 

As far as problem (P^,,^^) is concerned, here is how the proximity operators are com- 
puted: 

(1) proxj^ojjo^ is computed using Theorem 5 and Lemma 7: use Theorem 5(i) and 
Theorem 5(iii) if $ is a tight frame, or Theorem 5(iii) otherwise; 

(2) prox^2 = prox^^ is given using Lemma 6; 

(3) proxjr^o^ = prox,^,Q^ is given by Theorem 5(i) or (ii) and the projector onto the 
closed convex set C: use Theorem 5(i) if # is a tight frame, or Theorem 5(ii) if it is 
a general frame; 

For example, let /2 be the fi-norm (i.e. tp = \.\) and fs = ic° where C is the positive 
orthant and * is a tight frame, then (P^y^^) is solved by Algorithm 3. 

5.3. Computational complexity and implementation details 

The bulk of computation of our deconvolution algorithm is invested in applying * 

(resp. H) and its adjoint (resp. H"""). These operators are never constructed ex- 
plicitly, rather they are implemented as fast implicit operators taking a vector a;, and 
returning #a; (resp. *'''a;) and Ha; (resp. H^'^a;). Multiplication by H or H""" costs 
two FFTs, that is 2nlogn operations (n denotes the number of pixels). The complexity 
of # and depends on the transforms in the dictionary: for example, the orthogo- 
nal wavelet transform costs 0(n) operations, the translation-invariant discrete wavelet 
transform (TI-DWT) costs 0(n\ogn), the curvclct transform costs O(nlogn), etc. Let 
denote the complexity of applying the analysis or synthesis operator. Define A^ext and 
Nint as the maximal number of outer and inner iterations respectively in Algorithm 1 
and Algorithm 2, and recall that L is the number of coefficients. The computational 
complexity of the algorithms for solving (P-y.^) and (P' ^) arc summarized as follows: 



Problem 


Computational complexity 




$ orthobasis or tight frame 


(P7,^) 


A^ext (Aint (40(71 log n) + 2V^) + 0{n)) 




Aext (4y* + Aint (4nlogn) + 0{L)) 




$ general frame or linear bounded operator 


(P7,V) 


Aext (Ai„t (40(nlogn) + 21/*) + 0{n)) 




Aext (Aint (4y* + 4nlogn + 0{L))) 
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6. Results 



6.1. Experimental protocol 

We applied the proposed algorithms on a simulated image of the Hubble Space Tele- 
scope Wide Field Camera of a distant cluster of galaxies [38] and an image of the planet 
Saturn. In the sequel, our algorithms are respectively dubbed Prox-FA (for the one 
with the analysis prior) and Prox-FS (for that associated to the synthesis prior). For 
both of them, the number of interior iterations iVint was set to 10 which was enough to 
ensure convergence (i.e. the error terms in the computation of the proximity operators 
were under control). The obtained results were compared with others state-of-the art 
algorithms: the naive approach treating the noise as if it were additive white Gaiissian 
(NaiveG [42]), the Anscombe variance stabilizing approach (StabG [13]), and the algo- 
rithms proposed in [21] with the synthesis (PIDAL-FS) and the analysis (PIDAL-FA) 
priors. For all compared algorithms, the sparsity-promoting penalty was the £i-norm 
whose proximity operator is well-known soft-thresholding operator. For fair comparison, 
all algorithms were stopped when the relative change between two consecutive iterates 
was below a given tolerance 5 > 0, i.e. 



The quantitative score of restoration quality was the the mean absolute error (MAE). 
The MAE was chosen as it is well suited to Poisson noise [13]. 

As usual in regularized inverse problems, the choice of the regularization parameter 
7 is crucial. When the reference image is available, its choice can be tweaked manually 
so as to maximize the restoration quality by minimizing some quality score; e.g. MAE. 
But of course, this is a tedious task and cumbersome for real data. To circumvent these 
difficulties, in [13], an objective model selection driven choice based on the GCV was 
proposed. An approximate closed- form of the GCV was derived (see for [13] details). 



where x* is the restored image provided by the estimator. It was shown in [13] that 
the GCV-based choice of 7, although turned out to be close to the value that actually 
minimizes the MAE and the MSE, was slightly upper-biased in practice. The GCV then 
was used as a first good guess that might need refinement. 

6.2. Simulated sky image 

Figure 1(a) displays a simulated image of the sky. The maximal intensity (mean 

photon count) in this image is ^18000. The blurred version by h (using the Hubble's 
PSF before being repaired [38]) and the noisy ones are depicted in Figure 1(b) and 
(c). For this image, the dictionary * contained a tight frame of translation-invariant 
wavelets (TI-DWT). For this image, we also include in the comparative study the de- 
convolution obtained with Richardson-Lucy regularized with the wavelet multiresolution 
support (RL-MRS) as proposed in [36]. The latter was specifically designed for this kind 



l|a;t+i -xtlla/llxtlla < 5 . 



(13) 




(14) 



with df « card {i = 1, . . . ,L 
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RL-MRS [38] 


RL-TV [12] 


FTITPR [44] 


NaiveG [42] 


StabG [13] 


MAE 


63.5 


52.8 


60.9 


39.8 


43 




PIDAL-FA [21] 


PIDAL-FS [21] 


Prox-FA 


Prox-FS 




MAE 


40 


43.6 


37.8 


35.6 



Table 1: MAE for the deconvolution of the sky image. 



of objects. We also compare to the total- variation regularized Richardson-Lucy algorithm 

(RL-TV) proposed in [12], and the fast translation invariant tree-pruning reconstruction 
combined with an EM algorithm (FTITPR [44]). For each compared method, the reg- 
ularization parameter was tuned manually to achieve its best performance level (i.e. 
minimize MAE), and the convergence tolerance S was set to 10~^. 

For this image, the best algorithm is clearly RL-MRS where most of the small faint 
structures as well as bright objects are well deconvolved. However, small artifacts from 
the wavelet transform are also present. The StabG algorithm does a good job and 
preserves most of the weak objects which are barely visible in the degraded noisy image. 
At this intensity regime, the NaiveG algorithm yielded satisfactory results, which are 
comparable to ours. FTITPR correctly restores most of the important structures with 
a clean background, but many faint objects are lost. RL-TV leads to a deconvolution 
similar to ours for bright objects, but the background is dominated by spurious artifacts. 
The PIDAL approaches give quite similar results, PlDAL-FA leads here to a smoother 
estimate than PlDAL-FS. In the other hand PlDAL-FS seems to preserve more sources, 
as it can be seen for example on the bottom left, where one can notice objects that do 
not appear in the results of PIDAL-FA. In the same way, our algorithms yield results 
comparable to their respective parts of PIDAL as expected since they solve the same 
optimization problems. Prox-FS shows one of the best deconvolution results with a 
clean background and good restoration of most objects, although a few small details are 
oversmooth (see on bottom left). 

These qualitative results are confirmed by a quantitative score of the restoration 
quality in terms of the MAE, see Table 1. Notice that the hight MAE value of RL-MRS, 
which might seem surprising given its good visual result, might be explained by the 
artifacts that locally destroy the photometry. 

6.3. Intensity level influence 

To assess the impact of the intensity level (average photon counts), the different 
algorithms in the comparative study wcirc applied to an image of the planet Saturn at 
several intensity levels. In this experiment, the PSF was a 7 x 7 moving average. The 
regularization parameter 7 was chosen using the GCV criterion (14). Given that this 
image is picccwise smooth away from smooth curvilinear structures (e.g. the rings), the 
curvelet dictionary will be a very good candidate to sparsify it [4] . 

At each intensity value, ten noisy and blurred replications were generated and the 
MAE was computed for each deconvolution algorithm. The average and relative (i.e. 
divided by the mean intensity) MAE values over the ten realizations are summarized 
in Table 2. As expected, for low intensity levels, the best results are obtained for the 
methods using data fidelity terms that account properly for Poisson noise. For higher 
levels, the performance of NaiveG gets better as the Poisson distribution tends to normal 
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Intensity Level 


NaiveG 


StabG 


PIDAL-FA 


PIDAL-FS 


Prox-FA 


Prox-FS 


5 


0.70 (33.49) 


1.14 (54.54) 


0.34 (16.04) 


0.24 (11.78) 


0.42 (20.09) 


0.25 (11.96) 


30 


1.27 (10.13) 


6.62 (52.78) 


4.22 (33.62) 


1.41 (11.23) 


2.35 (18.74) 


2.09 (16.66) 


100 


3.24 (7.75) 


26.73 (63.94) 


23.40 (55.97) 


12.46 (29.81) 


7.88 (18.85) 


7.31 (17.49) 


255 


7.48 (7.02) 


80.45 (75.46) 


73.83 (69.26) 


49.36 (46.30) 


21.02 (19.72) 


18.78 (17.62) 



Table 2: Average MAE values and relative MAE (in parentheses and in percent) for the Saturn image as 
a function of the intensity level with the naive Gaussian (NaiveG) [43], stabilized gaussian (StabG) [13], 
PIDAL-FA and PIDAL-FS [21], and our two algorithms Prox-FA and Prox-FS. 



with increasing intensity. The difference in performance, especially for the high inten- 
sity regime, between our approaches and the PIDAL ones seem to originate from the 
optimization algorithms used (and their convergence then). 

For this image, it seems that the synthesis prior is slightly better than the analysis 
one. This might seem contradictory with results reported in the Gaussian noise case 
by a few authors where the analysis prior was observed to be superior to its synthesis 
counterpart [5, 17, 34]. But one should avoid to infer strong conclusions from such limited 
body of experimental work. For instance, the noise is different in our setting, and the 
deep phenomena underlying differences and similarities between analysis and synthesis 
priors are still not very well understood and more investigation is needed in this direction 
both on the theoretical and practical sides. 

The visual results obtained for a maximal intensity of 30 are portrayed in Figure 2. For 
all the methods, the curvelet transform is able to capture most of the curved structures 
inside the image and then leads to a good restoration of them. The original image 
contains some tiny satellites at the bottom right that appear as localized spikes. Prox- 
FA was able to recover one of them, though it is hardly visible in the restored image. 
Prox-FS however recovers several structures that look like background artifacts including 
the satellites. A wise solution would be to use another transform in the dictionary beside 
curvelets, which is well-adapted to point-like structures (e.g. wavelets or even Diracs). 

7. Conclusion 

In this paper, a novel sparsity-based fast iterative thresholding deconvolution algo- 
rithm that takes account of the presence of Poisson noise was presented. The Poisson 
noise was handled properly using its associated likelihood function to construct the data 
fidelity term. A careful theoretical study of the optimization problems and convergence 
guarantees of the iterative algorithms were provided. Several experimental experiments 
have shown the capabilities of our approach, which compares favorably with some state- 
of-the-art algorithms. 

The present work may be extended along several lines. For example, our approach 
uses a proximal algorithm which is a generalization of the Douglas-Rachford splitting 
applied to the primal problem, but to the best of our knowledge, its convergence rate 
(even on the objective) is not known. Furthermore, in this paper, the regularization 
parameter was chosen based on the GCV formula proposed, see (14). But this formula 
was derived using some heuristics and was valid for the stabilized version of the deconvo- 
lution problem. This objective and automatic selection of the regularization parameter 
remains an open issue that is worth investigating in the future. 
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Appendix A. Proof of Theorem 5 

As A is an afEne operator, it preserves convexity. Thus, / o A is convex. Moreover, 
A is continuous and / is Isc, and so is / o A. By the domain qualification condition, 
we then have / o A G ro(R"'). Note that the domain assumption in the theorem is not 
needed for frames by surjectivity of F. 

By Fenchel-Rockafellar duaUty [16, Section III.4], we have 

1 2 

p = prox <^ p = arginf - Hp - xll + / o A(p) (A.l) 

^"■^ PGR" ^ 



u = argmm - 



X - F^u 



+ {u,y) + r{u), (A.2) 



and the primal solution p is recovered from the dual solution u as 

p = X — F'^u . 

(i) F is a tight frame with constant c > 0. Applying the minimality condition to (A.2), 
using the fact that FF^ — cl and Moreau Identity, we obtain 

(Fx — y) — cu e df*{u) 
c-^{Fx-y)-u e d{c'^f*){u) 

u = proXg-i^. (c~^(Fa; — y)) 

u = (I - prox,^) (Fa; - y) , (A.3) 

which leads to (8). 

2 

(ii) F is a general frame operator. From (A.2), j x — F . + is continuous 

with C2-Lipschitz continuous gradient. Therefore, applying the forward-backward 
splitting to the minimization problem (A.2), we obtain (10). Prom strong convexity 
of the dual problem (A.2), the dual minimizer is unique. 

To get the linear convergence rate, the proof is very classical and is an extension of 
the projected gradient descent one that can be found in classical monographs such 
as [6, Theorem 8.6.2]. 

(iii) In all other cases, the forward-backward splitting applied to (A.2) converges pro- 
vided that < inf/, T/, < sup, t/, < 2/|F[||^ = 2/c2. The proof of the convergence 
rate is technical and follows the same lines as [18, Theorem 1]. 
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Figure 1: Deconvolution of the sky simulation, (a) Original, (b) Blurred, (c) Blurred and noisy, (d) 
RL-MRS [38], (e) RL-TV [12], (f) FTITPR [44], (g) NaiveG [42], (h) StabG [13], (i) PIDAL-FA [21], (j) 
PIDAL-FS [21], (k) Prox-FA, (1) Prox-FS. 18 
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Figure 2: Dcconvolution of Saturn image, (a) original, (b) Convolved, (c) Convolved and Noisy, (d) 
NaiveG [43], (e) StabG [13], (f) PIDAL-FA [21], (g) PIDAL-FS [21], (h) Prox-FA, (i) Prox-FS. 
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